
Collaborating Authors

 Hanover



Is the Dictionary Done For?

The New Yorker

The print edition of Merriam-Webster was once a touchstone of authority and stability. Then the internet brought about a revolution. Wars over words are inevitably culture wars, and debates over the dictionary have raged for as long as it has existed. Once, every middle-class home had a piano and a dictionary. The purpose of the piano was to be able to listen to music before phonographs were available and affordable. Later on, it was to torture young persons by insisting that they learn to do something few people do well. The purpose of the dictionary was to settle intra-family disputes over the spelling of words like "camaraderie" and "sesquipedalian," or over the correct pronunciation of "puttee." This was the state of the world not that long ago. In the late nineteen-eighties, Merriam-Webster's Collegiate Dictionary was on the best-seller list for a hundred and fifty-five consecutive weeks. Fifty-seven million copies were sold, a number believed to be second only, in this country, to sales of the Bible. There was good money in the word business.


AsymPuzl: An Asymmetric Puzzle for multi-agent cooperation

Cadet, Xavier, Koh, Edward, Chin, Peter

arXiv.org Artificial Intelligence

Large Language Model (LLM) agents are increasingly studied in multi-turn, multi-agent scenarios, yet most existing setups emphasize open-ended role-play rather than controlled evaluation. We introduce AsymPuzl, a minimal but expressive two-agent puzzle environment designed to isolate communication under information asymmetry. Each agent observes complementary but incomplete views of a symbolic puzzle and must exchange messages to solve it cooperatively. Using a diverse set of current-generation and open-source LLMs, we show that (i) strong models such as GPT-5 and Claude-4.0 reliably converge across puzzle sizes on the solution by sharing complete information in two turns, (ii) weaker models often ignore partner messages or over-correct their hypotheses, and (iii) feedback design is non-trivial: simple self-feedback improves success rates, while detailed joint feedback can hurt performance. These findings show that even in simple cooperative tasks, LLM communication strategies diverge and depend on the granularity of feedback signals. AsymPuzl thus provides a testbed for probing the limits of multi-turn cooperation and opens avenues for studying coordination mechanisms.
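
The two-turn "share complete information" strategy mentioned in the abstract can be illustrated with a small sketch. It is not the AsymPuzl environment itself: the puzzle representation (a mapping from position to attributes, with each agent seeing a different attribute) and the function names `merge_views` and `two_turn_exchange` are assumptions made here for illustration only.

```python
# Hypothetical illustration, not the AsymPuzl implementation.
# Assumption: a puzzle is a dict mapping position -> {attribute: value},
# and each agent sees a complementary subset of the attributes.

def merge_views(own, received):
    """Combine an agent's own partial view with the partner's message."""
    merged = {pos: dict(attrs) for pos, attrs in own.items()}
    for pos, attrs in received.items():
        merged.setdefault(pos, {}).update(attrs)
    return merged


def two_turn_exchange(view_a, view_b):
    """Turn 1: A sends its full view to B. Turn 2: B sends its full view to A."""
    belief_b = merge_views(view_b, view_a)  # B after reading A's message
    belief_a = merge_views(view_a, view_b)  # A after reading B's message
    return belief_a, belief_b


view_a = {0: {"shape": "circle"}, 1: {"shape": "square"}}
view_b = {0: {"color": "red"}, 1: {"color": "blue"}}
belief_a, belief_b = two_turn_exchange(view_a, view_b)
```

Under these assumptions both agents hold the same complete picture after two messages, which is the behaviour the abstract attributes to the stronger models.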


Augmented Runtime Collaboration for Self-Organizing Multi-Agent Systems: A Hybrid Bi-Criteria Routing Approach

Yang, Qingwen, Qu, Feiyu, Guo, Tiezheng, Liu, Yanyi, Wen, Yingyou

arXiv.org Artificial Intelligence

LLM-based multi-agent systems have demonstrated significant capabilities across diverse domains. However, their task performance and efficiency are fundamentally constrained by their collaboration strategies. Prevailing approaches rely on static topologies and centralized global planning, a paradigm that limits their scalability and adaptability in open, decentralized networks. Effective collaboration planning in distributed systems using only local information thus remains a formidable challenge. To address this, we propose BiRouter, a novel dual-criteria routing method for Self-Organizing Multi-Agent Systems (SO-MAS). This method enables each agent to autonomously execute "next-hop" task routing at runtime, relying solely on local information. Its core decision-making mechanism is predicated on balancing two metrics: (1) the ImpScore, which evaluates a candidate agent's long-term importance to the overall goal, and (2) the GapScore, which assesses its contextual continuity for the current task state. Furthermore, we introduce a dynamically updated reputation mechanism to bolster system robustness in untrustworthy environments and have developed a large-scale, cross-domain dataset, comprising thousands of annotated task-routing paths, to enhance the model's generalization. Extensive experiments demonstrate that BiRouter achieves superior performance and token efficiency over existing baselines, while maintaining strong robustness and effectiveness in information-limited, decentralized, and untrustworthy settings.
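
The idea of balancing two scores for next-hop selection can be sketched as follows. The abstract does not say how ImpScore, GapScore and reputation are combined, so the linear weighting `alpha`, the multiplication by reputation, and the names `Candidate`, `combined_score` and `next_hop` are all assumptions for illustration, not the BiRouter method.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A neighbouring agent visible from local information (hypothetical type)."""
    name: str
    imp_score: float   # assumed: long-term importance to the overall goal
    gap_score: float   # assumed: contextual continuity with the current task state
    reputation: float = 1.0  # assumed: trust weight in [0, 1]


def combined_score(c: Candidate, alpha: float = 0.5) -> float:
    # Assumption: a convex combination of the two scores, scaled by reputation.
    return c.reputation * (alpha * c.imp_score + (1 - alpha) * c.gap_score)


def next_hop(candidates: list[Candidate], alpha: float = 0.5) -> Candidate:
    """Pick the locally best candidate to route the task to."""
    if not candidates:
        raise ValueError("no candidate agents available")
    return max(candidates, key=lambda c: combined_score(c, alpha))


neighbours = [
    Candidate("A", imp_score=0.9, gap_score=0.2),
    Candidate("B", imp_score=0.5, gap_score=0.8),
    Candidate("C", imp_score=0.9, gap_score=0.9, reputation=0.5),
]
chosen = next_hop(neighbours)
```

In this toy setting a low reputation can outweigh high raw scores, which is one simple way a reputation signal might guard against untrustworthy agents.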